Sentence-State LSTM for Text Representation
Hyperparameters: Table 2 shows the development results of various S-LSTM settings, where Time refers to training time per epoch. Adding one additional sentence-level node as described in Section 3.2 does not improve accuracy, although the number of parameters and the decoding time increase accordingly. We therefore use only one sentence-level node for the remaining experiments. The accuracy of S-LSTM increases as the hidden size of each node grows from 100 to 300, but does not increase further when the size grows beyond 300. We therefore fix the hidden size to 300.
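
The selection rule used in this paragraph (prefer the cheapest setting whose development accuracy matches the best one) can be sketched as follows. This is only an illustrative sketch, not the authors' tuning code: the names `DevResult` and `pick_setting`, the `tolerance` parameter, and every number in `results` are hypothetical placeholders and are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DevResult:
    """One development-set run (hypothetical record layout)."""
    hidden_size: int
    sentence_nodes: int
    dev_accuracy: float  # percent
    params_k: int        # parameter count, in thousands


def pick_setting(results, tolerance=0.0):
    """Return the cheapest setting within `tolerance` of the best dev accuracy."""
    best = max(r.dev_accuracy for r in results)
    candidates = [r for r in results if r.dev_accuracy >= best - tolerance]
    # Break ties by parameter count, then by number of sentence-level nodes.
    return min(candidates, key=lambda r: (r.params_k, r.sentence_nodes))


# Placeholder values for illustration only; NOT the paper's reported results.
results = [
    DevResult(hidden_size=100, sentence_nodes=1, dev_accuracy=80.0, params_k=100),
    DevResult(hidden_size=300, sentence_nodes=1, dev_accuracy=82.0, params_k=300),
    DevResult(hidden_size=300, sentence_nodes=2, dev_accuracy=82.0, params_k=350),
    DevResult(hidden_size=600, sentence_nodes=1, dev_accuracy=82.0, params_k=600),
]

chosen = pick_setting(results)
print(chosen.hidden_size, chosen.sentence_nodes)
```

With these placeholder numbers, the larger settings tie on accuracy, so the rule falls back to the smallest model among them, mirroring the reasoning that extra nodes or a larger hidden size are not worth their cost when accuracy does not improve.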